ENHANCING SOURCE CODING SYSTEMS BY ADAPTIVE TRANSPOSITION
TECHNICAL FIELD

The present invention relates to a new method for enhancement of source coding systems using
high-frequency reconstruction. The invention teaches that tonal signals can be classified as either
pulse-train-like or non-pulse-train-like. Relying on this classification, significant improvements in
the perceived audio quality can be obtained by adaptive switching of transposers. The invention shows
that the transposers thus switched must differ fundamentally in their characteristics.

BACKGROUND OF THE INVENTION 

In "Source Coding Enhancement using Spectral-Band Replication" [WO 98/57436], transposition was
defined and established as an efficient means for high frequency generation to be used in an HFR
(High Frequency Reconstruction) based codec. Several transposer implementations were described.
However, apart from a brief discussion on transient response improvements, programme dependent
adaptation of fundamental transposer characteristics was not elaborated upon.



SUMMARY OF THE INVENTION 

The present invention teaches that tonal passages, i.e. excerpts dominated by contributions from
pitched instruments, can be characterised as "pulse-train-like" or "non-pulse-train-like". A typical
example of the former is the human voice in case of vowels, or a single pitched instrument, such as a
trumpet, where the "excitation signal" can be modelled as a "pulse-train". The latter is the case where
several different pitches are combined, and thus no single pulse-train can be identified. According to
the present invention, the HFR performance can be significantly improved by discriminating between
the above two cases and adapting the transposer properties correspondingly.

When a pulse-train-like passage is detected, the transposer shall preferably operate on a per-pulse
basis. Here, the decoded lowband, serving as the input signal to the transposer, can be viewed as a
series of impulse responses h(n) of lowpass character with cut-off frequency f_c, separated by a
period T_p. This corresponds to a Fourier series with fundamental frequency 1/T_p, containing
harmonics at all integer multiples of 1/T_p up to the frequency f_c. The objective of the transposer
is to increase the bandwidth of the individual responses h(n) up to the desired bandwidth Nf_c, where
N is the transposition factor, without altering the period T_p. Since the pulse period is preserved,
the transposed signal still corresponds to a Fourier series with fundamental 1/T_p, now containing all
partials up to Nf_c. Hence this method provides a perfect continuation to the truncated Fourier series
of the lowband. Some prior art methods satisfy the requirement of preservation of the pulse period.
Examples are frequency translation, and FD-transposition according to [WO 98/57436], where the window
is selected short enough not to contain more than one period, i.e. length(window) ≤ T_p. Neither of
those implementations handles material with multiple pitches well, and only the FD-transposition
provides a perfect continuation to the truncated Fourier series of the lowband.

When a non-pulse-train-like passage is detected, e.g. when multiple pitches are at hand, the demands
on the transposer instead shift from preservation of pulse periods to preservation of integer
relationships between lowband harmonics and generated higher partials. This requirement is met by the
FD-transposition methods in [WO 98/57436], where the window is selected long enough that many periods
T_i of the individual pitches forming the sequence are contained within one window, i.e.
length(window) >> T_i. Hereby any truncated Fourier series [f_i, 2f_i, 3f_i, ...] in the transposer
source frequency range is transposed to [Nf_i, 2Nf_i, 3Nf_i, ...], where N is the integer
transposition factor. Clearly, as opposed to the above per-pulse operation, this scheme does not
generate a full continuation of the lowband Fourier series. This is tolerable for multi pitched
signals, but not ideal for the single pitch pulse-train-like case. Thus, this transposition mode is
preferably only used in non-pulse-train-like cases.

According to the present invention, discrimination between pulse-like and non-pulse-like signals can
be performed in the encoder, and a corresponding control signal sent to the decoder. Alternatively,
the detection can be done in the decoder, eliminating the need for control signals but at the expense
of higher decoder complexity. Examples of detector principles are transient detection in the time
domain, as well as peak-picking in the frequency domain. The decoder includes means for the necessary
transposer adaptation. As an example, a system using frequency translation for the pulse-train-like
case, and a long window FD transposer for the non-pulse-train-like case, is described. The actual
switching or cross-fading between transposers is preferably performed in an envelope-adjusting
filterbank.

The present invention comprises the following features:

- Adaptively over time selecting different methods for high frequency generation, based on whether
  the signal being processed has a pulse-train-like character or a non-pulse-train-like character.

- the selection is done based on analysis by peak-picking in a time- and frequency-domain
  representation of the signal.

- the different methods for high frequency generation are frequency translation and FD transposition,
  or

- the different methods for high frequency generation are FD transposition with different window
  sizes, or

- the different methods for high frequency generation are time-domain pulse train transposition and
  FD transposition.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will now be described by way of illustrative examples, not limiting the scope or
spirit of the invention, with reference to the accompanying drawings, in which:

Fig. 1a illustrates an input pulse-train signal x(n).
Fig. 1b illustrates the magnitude spectrum |X(f)| of the signal x(n).
Fig. 2a illustrates the impulse response h_0(n) of a FIR filter.
Fig. 2b illustrates the magnitude spectrum |H_0(f)| of the FIR filter.
Fig. 3a illustrates a signal y_0(n) = x(n) * h_0(n).
Fig. 3b illustrates the magnitude spectrum |Y_0(f)| of the signal y_0(n).
Fig. 4a illustrates the decimated impulse response h_1(n) of a FIR filter.
Fig. 4b illustrates the magnitude spectrum |H_1(f)| of the decimated FIR filter.
Fig. 5a illustrates the transposed signal y_1(n).
Fig. 5b illustrates the magnitude spectrum |Y_1(f)| of the signal y_1(n).
Fig. 6 illustrates the magnitude spectrum |Y_2(f)|, after FD-transposition with a long window of the
signal x(n).
Fig. 7 illustrates an implementation of the present invention on the decoder side.

DESCRIPTION OF PREFERRED EMBODIMENTS 

The below-described embodiments are merely illustrative for the principles of the present invention
for adaptive transposer switching for HFR systems. It is understood that modifications and variations
of the arrangements and the details described herein will be apparent to others skilled in the art. It
is the intent, therefore, to be limited only by the scope of the impending patent claims and not by
the specific details presented by way of description and explanation of the embodiments herein.

"Ideal transposition" of a single pitched pulse-train-like signal can be defined by means of a simple
model. Let the original signal be a sum of diracs δ(n), separated by m samples, i.e. a pulse-train

$$x(n) = \sum_{i=-\infty}^{\infty} \delta(n - im) \qquad (1)$$

Fig. 1a shows x(n), and Fig. 1b the corresponding magnitude spectrum |X(f)|. Clearly |X(f)|
corresponds to a Fourier series with fundamental f_s/m, where f_s is the sampling frequency. Let
y_0(n) be a low-pass filtered version of x(n), where the low-pass FIR filter has the impulse response
h_0(n) of length p such that p < m, see Figs. 2a and 2b for the time and frequency domain
representation respectively. The filter cut-off frequency is f_c. The output signal is then given by

$$y_0(n) = x(n) * h_0(n) = \sum_{i=-\infty}^{\infty} \delta(n - im) * h_0(n) = \sum_{i=-\infty}^{\infty} h_0(n - im) \qquad (2)$$

i.e. a series of impulse responses, separated by m samples. Figs 3a and 3b show y_0(n) and |Y_0(f)|.
The original Fourier series has effectively been truncated at the frequency f_c. Assume that a time
domain based transposer is able to detect the individual impulse responses h_0(n - im), and that those
signals are decimated by a factor 2, i.e. every second sample is fed to the output. The discarded
samples are compensated for by insertion of zeroes between the shorter responses h_1(n - im), in order
to preserve the length of the signal. The decimated impulse response h_1(n) and the corresponding
frequency representation |H_1(f)| are shown in Figs 4a and 4b. Obviously, the narrowing of the time
domain signal corresponds to a widening of the frequency domain signal, in this case by a factor 2.
Finally, the transposed signal

$$y_1(n) = \sum_{i=-\infty}^{\infty} h_1(n - im)$$

and |Y_1(f)| are shown in Figs 5a and 5b. The bandwidth of the LP filtered pulse-train has been
increased, while preserving the correct time, and thereby also frequency, properties. The output
signal corresponds to a Fourier series with partials reaching up to the frequency 2f_c.
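
The decimation model can be checked numerically. The following is a minimal Python sketch and not part
of the original disclosure: the windowed-sinc filter design, the pulse period m = 64, the filter length
p = 33, the cut-off of 0.1 f_s and all helper names are assumptions chosen purely for illustration.

```python
import numpy as np

def lowpass_fir(num_taps, cutoff):
    """Windowed-sinc low-pass FIR; cutoff is a fraction of f_s (assumed design)."""
    n = np.arange(num_taps) - (num_taps - 1) / 2
    h = 2 * cutoff * np.sinc(2 * cutoff * n) * np.hamming(num_taps)
    return h / h.sum()

def pulse_train(h, m, num_pulses):
    """y(n) = sum_i h(n - i*m): one copy of h every m samples, zeros elsewhere."""
    y = np.zeros(m * num_pulses)
    for i in range(num_pulses):
        y[i * m : i * m + len(h)] += h
    return y

def bandwidth(h, nfft=4096, drop_db=-6.0):
    """First frequency (fraction of f_s) where |H(f)| drops below drop_db re. its maximum."""
    mag = np.abs(np.fft.rfft(h, nfft))
    mag_db = 20 * np.log10(mag / mag.max() + 1e-12)
    return np.nonzero(mag_db < drop_db)[0][0] / nfft

m, p = 64, 33                      # pulse period and FIR length, p < m (assumed values)
h0 = lowpass_fir(p, cutoff=0.1)    # lowband response h_0(n), f_c = 0.1 * f_s (assumed)
h1 = h0[::2]                       # decimation by 2: every second sample is kept

y0 = pulse_train(h0, m, 16)        # band-limited pulse train of Eq. 2
y1 = pulse_train(h1, m, 16)        # transposed signal; zeros replace the discarded samples

print(bandwidth(h0), bandwidth(h1))  # the second value is roughly twice the first
```

Both signals repeat every m samples, so the fundamental f_s/m is untouched, while the -6 dB bandwidth
of the individual responses approximately doubles.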

The above transposition can be approximated in several ways. One approach is to use a frequency domain
transposer (FD-transposer) such as the STFT transposer described in [WO 98/57436], but with different
window sizes, i.e. a short window is used for pulse-train signals, and a long window is used for all
other signals. The short window (of length < m in the above example) ensures that the transposer
operates on a per pulse basis, giving the desired pulse transposition outlined above. A different
approach for pulse transposition is using single-side-band modulation. This ensures that the period
time between the pulses T_p is correct; however, the generated partials are not harmonically related
to the partials of the lowband. It should also be pointed out that different pulse-train transposition
algorithms may perform differently for different program material. Therefore several pulse-train
transposers could be used with suitable detection algorithms, in the encoder and/or the decoder, to
ensure optimal performance.
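
The behaviour of single-side-band modulation can be illustrated on the partial frequencies alone. The
sketch below is a simplified model under stated assumptions: it treats the modulation as a pure shift
of every lowband partial by a constant offset, and the pitch of 110 Hz, the cut-off of 4000 Hz and the
function names are hypothetical.

```python
def translate_partials(f0, fc, shift):
    """Frequency translation: each lowband partial k*f0 <= fc is moved up by `shift` Hz."""
    lowband = [k * f0 for k in range(1, fc // f0 + 1)]
    return [f + shift for f in lowband]

def spacings(partials):
    """Set of distances between neighbouring partials."""
    return {b - a for a, b in zip(partials, partials[1:])}

def is_harmonic(partials, f0):
    """True if every partial is an integer multiple of f0."""
    return all(f % f0 == 0 for f in partials)

f0, fc = 110, 4000                         # pitch and lowband cut-off in Hz (assumed)
highband = translate_partials(f0, fc, shift=fc)

print(spacings(highband))                  # {110}: the spacing, and hence the period, is kept
print(is_harmonic(highband, f0))           # False: 4110, 4220, ... are not multiples of 110
```

Only when the shift happens to be an integer multiple of the pitch do the translated partials line up
with the lowband harmonic series, which cannot be relied upon for arbitrary programme material.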

For the pulse-train signal used in the example above, an implementation with a FD-transposition method
using a long window will give unsatisfactory results. This is due to the following:
When using a long window (of length >> m) in the FD-transposition method, the following relation
applies:

$$u(n) = \sum_{i=0}^{N-1} e_i(n) \cos(2\pi f_i n / f_s + \alpha_i) \;\rightarrow\; v(n) = \sum_{i=0}^{N-1} e_i(n) \cos(2\pi M f_i n / f_s + \beta_i) \qquad (3)$$

where u(n) is the input, v(n) is the output, M is the transposition factor, N is the number of
sinusoids, f_i, e_i(n), α_i are the individual input frequencies, time envelopes and phase constants
respectively, β_i are the arbitrary output phase constants and f_s is the sampling frequency, and
0 < Mf_i < f_s/2. The input signal x(n) will, using the relation in Eq. 3, yield an output signal
y_2(n) with a magnitude spectrum |Y_2(f)| according to Fig. 6, where the partials of y_2(n) are
harmonically related to the partials of x(n). However, the distance between them has increased
according to the transposition factor, i.e. the pitch of the signal has increased by the transposition
factor. When adding this new highband signal to the original lowband signal, the two different pitches
can clearly be discriminated. This causes for instance speech signals to sound as if an additional
speaker was speaking simultaneously but at a higher pitch, i.e. a so called ghost voice occurs.
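
The pitch change caused by the long-window mode follows directly from the frequency mapping
f_i -> M f_i. The sketch below applies that mapping to partial frequencies only; the pitch, cut-off,
sampling rate and helper names are illustrative assumptions, and envelopes and phases are ignored.

```python
def fd_transpose_partials(partials, M, fs):
    """Frequency mapping of the long-window mode: f_i -> M * f_i, kept up to fs / 2."""
    return [M * f for f in partials if M * f <= fs / 2]

def spacings(partials):
    """Set of distances between neighbouring partials."""
    return {b - a for a, b in zip(partials, partials[1:])}

f0, fc, fs, M = 100, 4000, 16000, 2        # pitch, cut-off, sampling rate in Hz (assumed)
lowband = [k * f0 for k in range(1, fc // f0 + 1)]
highband = [f for f in fd_transpose_partials(lowband, M, fs) if f > fc]

print(spacings(lowband))    # {100}
print(spacings(highband))   # {200}: partial spacing, and thus perceived pitch, is doubled
```

Every generated partial is still an integer multiple of the original pitch, yet every second harmonic
above the cut-off is missing, which is the origin of the second, higher pitch heard in the highband.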

However, as soon as the input signal does not display single-pitched pulse-train characteristics, a
pulse transposition is not applicable if high-quality HFR is required. Thus it is highly desirable to
detect which transposition method gives the best result at a given time, in order to optimise
performance of the HFR system.

In order to benefit from the different transposition characteristics in a decoder it is necessary to
assess, in the encoder and/or the decoder, which transposition method will give the best results at a
given time. There are several ways to detect pulse-train-like characteristics in a signal; it can be
done in either the time domain or the frequency domain. If a pulse train has a period time T_p, the
pulses will be separated in time by that period time and the frequency components will be 1/T_p apart.
Hence if T_p is high, i.e. a low-pitched pulse-train, this is preferably detected in the time domain
since the pulses are relatively far apart and thus easy to discriminate. However, if T_p is low, this
corresponds to a high-pitched pulse-train and hence it is more easily detected in the frequency
domain. For time domain detection it is preferable to spectrally whiten the signal in order to obtain
as pulse-train-like a character as possible for easier detection. The detection schemes in the time
domain and the frequency domain are similar. They are based on peak picking and statistical analysis
of the distances between picked peaks. In the time domain, the peak-picking is done by comparing the
energy and peak level of the signal before and after an arbitrary point, thus searching for transient
behaviour in the signal. In the frequency domain the peak detection is done on the harmonic product
spectrum, which is a good indication of whether a strong harmonic series is present. The distances
between the detected pitches are presented in a histogram, upon which the detection is made by
comparing the ratio between pitch-related entries and non-pitch-related entries.
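
A time-domain detector along these lines might be sketched as follows. This is a hedged illustration
rather than the detector of the invention: a plain thresholded local-maximum search stands in for the
energy and peak-level comparison, spectral whitening is omitted, and the score (the share of inter-peak
distances lying within one sample of the most common distance) is an assumed stand-in for the
histogram ratio. The threshold, tolerance and function names are hypothetical.

```python
import numpy as np

def pick_peaks(x, threshold_ratio=0.5):
    """Indices of local maxima of |x| exceeding threshold_ratio * max|x| (assumed picker)."""
    a = np.abs(np.asarray(x, dtype=float))
    thr = threshold_ratio * a.max()
    mid = a[1:-1]
    is_peak = (mid > a[:-2]) & (mid >= a[2:]) & (mid > thr)
    return np.nonzero(is_peak)[0] + 1

def pulse_train_score(x, threshold_ratio=0.5, tolerance=1):
    """Share of inter-peak distances within `tolerance` samples of the dominant distance."""
    peaks = pick_peaks(x, threshold_ratio)
    if len(peaks) < 3:
        return 0.0
    distances = np.diff(peaks)
    histogram = np.bincount(distances)      # histogram of distances between picked peaks
    dominant = histogram.argmax()           # most common distance, i.e. the candidate period
    related = np.abs(distances - dominant) <= tolerance
    return float(related.mean())

regular = np.zeros(800)
regular[::80] = 1.0                         # pulse train with a period of 80 samples
irregular = np.zeros(800)
irregular[[50, 90, 200, 230, 410, 500, 640, 700]] = 1.0   # no common period

print(pulse_train_score(regular))           # 1.0
print(pulse_train_score(irregular))         # about 0.14
```

A score near one suggests a single dominant period, so the per-pulse transposer would be selected; a
low score points towards the long-window mode. The decision threshold between the two is left open here.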

The implementation exemplified in Fig. 7 shows the usage of two different types of transposition
methods in the same decoder system - the types being a FD transposer using a long window and a
frequency translating device [PCT/SE01/01150]. The demultiplexer 701 unpacks the bitstream signal and
feeds it to an arbitrary baseband decoder 702. The output from the baseband decoder, i.e. a
bandwidth-limited audio signal, is fed to an analysis filterbank 703, which splits the audio signal
into spectral bands. The audio signal is simultaneously fed to an FD-transposer unit 705. The output
therefrom is fed to an additional analysis filterbank 706, which is of the same type as the filterbank
unit 703. The data from the filterbank unit 703 is patched 704 according to the principles of
frequency translating devices and fed to the mixing unit 707 together with the output from the
analysis filterbank 706. The mixing unit blends the data according to the control signal transmitted
from the encoder or control signals obtained by the decoder. The blended spectral data is subsequently
envelope adjusted in the envelope adjuster 708, using data and control signals sent in the bitstream.
The spectral-adjusted signal and the data from the analysis filterbank 703 are fed to a synthesis
filterbank unit 709, thus creating an envelope adjusted wideband signal. Finally, the digital wideband
signal is converted 710 to an analogue output signal.
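
The blending performed by the mixing unit 707 could, for instance, be a per-time-slot cross-fade
between the two candidate highbands. The sketch below assumes a linear cross-fade and a control weight
in the range 0 to 1; the text does not specify the blending law, and the array layout (bands by time
slots) and all names are hypothetical.

```python
import numpy as np

def mix_highband(translated, fd_transposed, pulse_weight):
    """Linear cross-fade of two highband subband sets of shape (num_bands, num_slots).

    pulse_weight is a scalar or a per-slot array in [0, 1]: 1 selects the frequency
    translated data (pulse-train-like), 0 selects the FD-transposed data.
    """
    translated = np.asarray(translated)
    fd_transposed = np.asarray(fd_transposed)
    w = np.clip(np.asarray(pulse_weight, dtype=float), 0.0, 1.0)
    return w * translated + (1.0 - w) * fd_transposed

num_bands, num_slots = 4, 5
translated = np.ones((num_bands, num_slots))       # stand-in for patched data from 704
fd_transposed = np.zeros((num_bands, num_slots))   # stand-in for data from filterbank 706
weights = np.linspace(0.0, 1.0, num_slots)         # gradual switch towards translation

mixed = mix_highband(translated, fd_transposed, weights)
print(mixed[0])                                    # [0.   0.25 0.5  0.75 1.  ]
```

Ramping the weight over a few time slots avoids an abrupt change of transposer, in line with the
cross-fading between transposers mentioned in the summary.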



